<h2><strong>Audio samples from &ldquo;Towards Natural Bilingual and Code-Switching Speech Synthesis Based on Mix of Monolingual Data and Cross-Lingual Voice Conversion&rdquo;</strong></h2>
<p><strong>Authors:</strong> Shengkui Zhao, Trung Hieu Nguyen, Hao Wang, Bin Ma</p>

<p>&nbsp;</p>
<h3><strong>The English target speaker&rsquo;s voice from professional recordings:</strong></h3>
<p><em>text: I think it could be something to do with the soil and the climate.</em></p>
<p><em><video controls="controls" width="300" height="50">
<source src="demo/tts/TTS_outs/tgt_voice_EN_F/tgt_EN_f_010027.wav" /></video></em></p>
<p><em>text: We are very concerned that it will not happen and we will be engaged.</em></p>
<p><em><video controls="controls" width="300" height="50">
<source src="demo/tts/TTS_outs/tgt_voice_EN_F/tgt_EN_f_010033.wav" /></video></em></p>
<p><em>text: They arrive at the door of the food bank from all over the city.</em></p>
<p><em><video controls="controls" width="300" height="50">
<source src="demo/tts/TTS_outs/tgt_voice_EN_F/tgt_EN_f_010036.wav" /></video></em></p>
<h3><strong>The Mandarin target speaker&rsquo;s voice from professional recordings:</strong></h3>
<p><em>text: 五千年传统文化信手拈来，被涂脂抹粉，戏谑调侃。 </em></p>
<p><em><video controls="controls" width="300" height="50">
<source src="demo/tts/TTS_outs/tgt_voice_CN_F/tgt_CN_f_000028.wav" /></video></em></p>
<p><em>text: 今天，和平广场里盛开着这样的黄玫瑰。</em></p>
<p><em><video controls="controls" width="300" height="50">
<source src="demo/tts/TTS_outs/tgt_voice_CN_F/tgt_CN_f_000031.wav" /></video></em></p>
<p><em>text: 他说，他会将车收藏一辈子，以后再传给儿女们。 </em></p>
<p><em><video controls="controls" width="300" height="50">
<source src="demo/tts/TTS_outs/tgt_voice_CN_F/tgt_CN_f_000037.wav" /></video></em></p>
<h2><strong>Cross-lingual voice conversion: </strong></h2>
<h3><strong>Source speaker: Mandarin female;&nbsp;</strong><strong>Target speaker: English female</strong></h3>
<p><em>text: 五千年传统文化信手拈来，被涂脂抹粉，戏谑调侃。</em></p>
<table style="height: 99px; width: 656px;">
<tbody>
<tr style="height: 15.1667px;">
<td style="width: 301.667px; text-align: center; height: 15.1667px;">Source speaker</td>
<td style="width: 10px; text-align: center; height: 15.1667px;">Target Speaker</td>
<td style="width: 114px; text-align: center; height: 15.1667px;">Tacotron2-VC</td>
<td style="width: 114px; height: 15.1667px;">&nbsp;</td>
</tr>
<tr style="height: 58px;">
<td style="width: 301.667px; height: 58px;"><video style="font-size: 14px; background-image: url('img/object.gif');" controls="controls" width="300" height="50">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/src_CN_f_000028.wav" /></video></td>
<td style="width: 10px; height: 58px;"><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/tgt_voice/tgt_EN_f_010033.wav" /></video></td>
<td style="width: 114px; height: 58px;"><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/TA_VC_000028_conv.wav" /></video></td>
<td style="width: 114px; height: 58px;">&nbsp;</td>
</tr>
</tbody>
</table>
<p><em>text: 他们家靠这个药，世代漂洗为生，日子过得艰难。</em></p>
<table style="height: 99px; width: 656px;">
<tbody>
<tr style="height: 15.1667px;">
<td style="width: 301.667px; text-align: center; height: 15.1667px;">Source speaker</td>
<td style="width: 10px; text-align: center; height: 15.1667px;">Target Speaker</td>
<td style="width: 114px; text-align: center; height: 15.1667px;">Tacotron2-VC</td>
<td style="width: 114px; height: 15.1667px;">&nbsp;</td>
</tr>
<tr style="height: 58px;">
<td style="width: 301.667px; height: 58px;"><video style="font-size: 14px; background-image: url('img/object.gif');" controls="controls" width="300" height="50">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/src_CN_f_000033.wav" /></video></td>
<td style="width: 10px; height: 58px;"><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/tgt_voice/tgt_EN_f_010027.wav" /></video></td>
<td style="width: 114px; height: 58px;"><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/TA_VC_000033_conv.wav" /></video></td>
<td style="width: 114px; height: 58px;">&nbsp;</td>
</tr>
</tbody>
</table>
<p><em>text: 众所周知，潍坊因风筝而在全国闻名遐迩。</em></p>
<table style="height: 99px; width: 656px;">
<tbody>
<tr style="height: 15.1667px;">
<td style="width: 301.667px; text-align: center; height: 15.1667px;">Source speaker</td>
<td style="width: 10px; text-align: center; height: 15.1667px;">Target Speaker</td>
<td style="width: 114px; text-align: center; height: 15.1667px;">Tacotron2-VC</td>
<td style="width: 114px; height: 15.1667px;">&nbsp;</td>
</tr>
<tr style="height: 58px;">
<td style="width: 301.667px; height: 58px;"><video style="font-size: 14px; background-image: url('img/object.gif');" controls="controls" width="300" height="50">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/src_CN_f_000040.wav" /></video></td>
<td style="width: 10px; height: 58px;"><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/tgt_voice/tgt_EN_f_010036.wav" /></video></td>
<td style="width: 114px; height: 58px;"><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_CN_F_tgt_EN_F/TA_VC_000040_conv.wav" /></video></td>
<td style="width: 114px; height: 58px;">&nbsp;</td>
</tr>
</tbody>
</table>
<h3><strong>Source speaker: English female; Target speaker:&nbsp;</strong><strong>Mandarin female</strong></h3>
<p><em>text: Hazel would like to sell the business.</em></p>
<table>
<tbody>
<tr>
<td style="text-align: center;">Source speaker</td>
<td style="text-align: center;">Target Speaker</td>
<td style="text-align: center;">Tacotron2-VC</td>
<td>&nbsp;</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/src_EN_f_010028.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/tgt_voice/tgt_CN_f_000037.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/TA_VC_010028_conv.wav" /></video></td>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<p><em>text: Many people have lost their jobs altogether.</em></p>
<table>
<tbody>
<tr>
<td style="text-align: center;">Source speaker</td>
<td style="text-align: center;">Target Speaker</td>
<td style="text-align: center;">Tacotron2-VC</td>
<td>&nbsp;</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/src_EN_f_010034.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/tgt_voice/tgt_CN_f_000031.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/TA_VC_010034_conv.wav" /></video></td>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<p><em>text: This is a serious accident, and we will do our utmost to identify the cause.</em></p>
<table>
<tbody>
<tr>
<td style="text-align: center;">Source speaker</td>
<td style="text-align: center;">Target Speaker</td>
<td style="text-align: center;">Tacotron2-VC</td>
<td>&nbsp;</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/src_EN_f_010041.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/tgt_voice/tgt_CN_f_000028.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/VC_outs/TA_VC_src_EN_F_tgt_CN_F/TA_VC_010041_conv.wav" /></video></td>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<h2><strong>Bilingual and code-switching speech synthesis:</strong></h2>
<h3><strong>English input text (none of the text was seen in the training set)</strong></h3>
<p><em>text: A microscopic water creature could live until the end of the Earth.</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/FS_EN_F_017.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TA_EN_F_017.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TR_EN_F_017.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/FS_CN_F_017.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TA_CN_F_017.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TR_CN_F_017.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>text: Christmas is widely celebrated and enjoyed across the United States and the world.</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/FS_EN_F_010.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TA_EN_F_010.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TR_EN_F_010.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/FS_CN_F_010.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TA_CN_F_010.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TR_CN_F_010.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>text: Many lessons are boring, and he is very tired after doing gym.</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/FS_EN_F_007.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TA_EN_F_007.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TR_EN_F_007.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/FS_CN_F_007.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TA_CN_F_007.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TR_CN_F_007.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>text: Besides carving pumpkins, some celebrate Halloween by putting decorations up.</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/FS_EN_F_020.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TA_EN_F_020.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TR_EN_F_020.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/FS_CN_F_020.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TA_CN_F_020.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/English/TR_CN_F_020.wav" /></video></td>
</tr>
</tbody>
</table>
<h3><strong>Chinese input text (none of the text was seen in the training set)</strong></h3>
<p><em>Text: 儿子一气之下没有去，靠自学考上了函授大学。</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/FS_EN_F_8.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TA_EN_F_8.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TR_EN_F_8.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/FS_CN_F_8.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TA_CN_F_8.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TR_CN_F_8.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: 目前您的电话接入后可能存在声音不清晰的情况。</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/FS_EN_F_44.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TA_EN_F_44.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TR_EN_F_44.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/FS_CN_F_44.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TA_CN_F_44.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TR_CN_F_44.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: 就在她伸手想拿起那个红通通的果子试吃的时候，肩膀突然被人轻轻一拍。</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/FS_EN_F_66.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TA_EN_F_66.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TR_EN_F_66.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/FS_CN_F_66.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TA_CN_F_66.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TR_CN_F_66.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: 现在看天上的星星总是觉得没有小时候看到的多。</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/FS_EN_F_85.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TA_EN_F_85.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TR_EN_F_85.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/FS_CN_F_85.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TA_CN_F_85.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Chinese/TR_CN_F_85.wav" /></video></td>
</tr>
</tbody>
</table>
<h3><strong>Code-switching input text (none of the text was seen in the training set)</strong></h3>
<p><em>Text: 我刚刚去 Starbucks 买了杯 Vanilla Latte 和两块 Oatmeal Raisin Cookie, 搭配起来还蛮不错的。</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_EN_F_108.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_EN_F_108.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_EN_F_108.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_CN_F_108.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_CN_F_108.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_CN_F_108.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: Brunch 这个词是 breakfast 和 lunch 两个词的结合，意思是“早午餐。”</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_EN_F_105.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_EN_F_105.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_EN_F_105.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_CN_F_105.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_CN_F_105.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_CN_F_105.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: 《life of Pi》的中文名字是《少年派的奇幻漂流》。</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_EN_F_107.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_EN_F_107.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_EN_F_107.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_CN_F_107.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_CN_F_107.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_CN_F_107.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: 这个源于拳击运动的表达 “to punch above your weight” 的本意是“能和高于自己重量级别的对手较量。”</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_EN_F_111.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_EN_F_111.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_EN_F_111.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_CN_F_111.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_CN_F_111.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_CN_F_111.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: 一见钟情 is similar to 一见倾心 which means love at first sight.</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_EN_F_113.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_EN_F_113.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_EN_F_113.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_CN_F_113.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_CN_F_113.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_CN_F_113.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: 你多吃一点 means “Have some more.” 而慢慢吃 expresses politeness to someone when eating.</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_EN_F_116.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_EN_F_116.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_EN_F_116.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_CN_F_116.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_CN_F_116.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_CN_F_116.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: When you wish to raise your drink to someone, to drink with them or propose a toast, you can say 我敬你一杯。</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_EN_F_117.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_EN_F_117.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_EN_F_117.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_CN_F_117.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_CN_F_117.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_CN_F_117.wav" /></video></td>
</tr>
</tbody>
</table>
<p><em>Text: The “闻” in “百闻不如一见” does not refer to smelling, but rather means to hear of, such as news, or by word of mouth.</em></p>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td>
<p style="text-align: center;"><strong>&nbsp;Target Speaker: English</strong></p>
</td>
<td>&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_EN_F_120.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_EN_F_120.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_EN_F_120.wav" /></video></td>
</tr>
</tbody>
</table>
<table width="774">
<tbody>
<tr>
<td>&nbsp;</td>
<td style="text-align: center;">
<p><strong>Target Speaker: Mandarin</strong></p>
</td>
<td style="text-align: center;">&nbsp;</td>
</tr>
<tr>
<td style="text-align: center;">FastSpeech</td>
<td style="text-align: center;">Tacotron2</td>
<td style="text-align: center;">Transformer</td>
</tr>
<tr>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/FS_CN_F_120.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TA_CN_F_120.wav" /></video></td>
<td><video controls="controls" width="300" height="50" data-mce-fragment="1">
<source src="demo/tts/TTS_outs/Code-Switching/TR_CN_F_120.wav" /></video></td>
</tr>
</tbody>
</table>